Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Authors

  • RAGHAV PANT
  • THEODORE B. TRAFALIS
  • KASH BARKER
Abstract

In this paper, we develop a robust Support Vector Machine (SVM) scheme for classifying imbalanced and noisy data using the principles of Robust Optimization. Uncertainty is prevalent in almost all datasets and has not been addressed efficiently by most data mining techniques, as these are based on deterministic mathematical tools. Imbalanced datasets arise in the analysis of rare events, and for such datasets the elements of the minority class become critical. Our method addresses both issues, which traditional SVM classification does not handle. At present, we provide solutions for linear classification of data with bounded uncertainties. This can be extended to non-linear classification schemes for any type of uncertainty that is convex. Our results in predicting the importance of the minority class are better than those of traditional soft-margin SVM classification. Preliminary computational results are presented.

Key-Words: Support Vector Machines, Robust Classification, Imbalance, Uncertainty, Noise
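The abstract does not give the formulation, so the following is only an illustrative sketch of one standard way to combine the two ideas for a linear classifier, not the authors' method. It assumes each point x_i may be perturbed within an L2 ball of radius eta_i (a bounded uncertainty), in which case the worst-case margin is y_i(w·x_i + b) − eta_i‖w‖, and it assumes per-sample costs c_i that weight the minority class more heavily. The function names, the subgradient solver, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np

def robust_weighted_hinge(w, b, X, y, eta, c):
    """Regularized worst-case hinge loss under L2-ball perturbations of radius eta_i."""
    # Worst-case margin: an adversary moves x_i by at most eta_i against w.
    margins = y * (X @ w + b) - eta * np.linalg.norm(w)
    return 0.5 * (w @ w) + np.sum(c * np.maximum(0.0, 1.0 - margins))

def fit_robust_svm(X, y, eta, c, steps=2000, lr=0.01):
    """Minimize the loss above by plain subgradient descent (assumed solver)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for t in range(steps):
        norm_w = np.linalg.norm(w)
        margins = y * (X @ w + b) - eta * norm_w
        active = margins < 1.0  # points whose worst-case hinge term is positive
        # Subgradient of ||w||; the zero vector is a valid choice at w = 0.
        unit = w / norm_w if norm_w > 0 else np.zeros(d)
        grad_w = (w
                  - (c[active] * y[active]) @ X[active]
                  + np.sum(c[active] * eta[active]) * unit)
        grad_b = -np.sum(c[active] * y[active])
        step = lr / np.sqrt(t + 1)  # diminishing step size
        w = w - step * grad_w
        b = b - step * grad_b
    return w, b

def predict(w, b, X):
    """Label points by the side of the hyperplane they fall on."""
    return np.where(X @ w + b >= 0, 1, -1)

# Toy imbalanced data: 2 minority (+1) points, 4 majority (-1) points.
X_toy = np.array([[2.0, 2.0], [3.0, 3.0],
                  [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0], [-3.0, -2.0]])
y_toy = np.array([1, 1, -1, -1, -1, -1])
eta_toy = np.full(6, 0.2)                    # assumed uncertainty radius
c_toy = np.where(y_toy == 1, 2.0, 1.0)       # assumed heavier minority cost
w_toy, b_toy = fit_robust_svm(X_toy, y_toy, eta_toy, c_toy)
```

Setting every eta_i to zero recovers an ordinary cost-weighted soft-margin loss, which makes the sketch easy to compare against a traditional SVM baseline.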

Similar resources

Robustified distance based fuzzy membership function for support vector machine classification

Fuzzification of the support vector machine has been used to deal with outliers and noise. This is achieved by means of a fuzzy membership function, which is generally built from the distance of each point to the class centroid. The focus of this research is twofold. Firstly, by taking advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...


Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator

The purpose of this paper is to identify the points that affect the performance of one of the important algorithms of data mining, namely the support vector machine. The final classification decision is based on a small portion of the data called support vectors. So the existence of atypical observations among these points will result in deviation from the correct decision. Thus...


Robust Cost Sensitive Support Vector Machine

In this paper we consider robust classifications and show equivalence between the regularized classifications. In general, robust classifications are used to create a classifier robust to data by taking into account the uncertainty of the data. Our result shows that regularized classifications inherit robustness and provide reason on why some regularized classifications tend to be robust agains...


Support Vector Machine Classifiers with Uncertain Knowledge Sets via Robust Optimization

In this paper we study Support Vector Machine (SVM) classifiers in the face of uncertain knowledge sets and show how data uncertainty in knowledge sets can be treated in SVM classification by employing robust optimization. We present knowledge-based SVM classifiers with uncertain knowledge sets using convex quadratic optimization duality. We show that the knowledge-based SVM, where prior knowled...


Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression gives access to a wealth of information with thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and tumor differences. In this study we try to reduce high-dimensional data by a statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumors based on gene expression data of...




Publication date: 2011